Abstract


It is natural to relate partially ordered sets (posets for short) and
classes of equivalent words over partially commutative alphabets. Their
common graphical representation is the Hasse diagram. We investigate
this relation in detail and propose an efficient online algorithm that
decompresses a concurrent word to its Hasse diagram. The lexicographically
minimal representative of a trace (an equivalence class of words)
is called its lexicographical normal form. We give an algorithm which
enumerates, in lexicographical order, all distinct traces identified
by their lexicographical normal forms. These two algorithms
are the main contribution of this paper.

Keywords: poset, Hasse diagram, partially commutative alphabets,
algorithms, generations


Introduction 


Many practical problems related to partially ordered sets have a very high
time complexity. Examples of such problems are the #P-complete problem
of counting the linear extensions of a poset [1] and the NP-complete
problem of computing the minimal number of jumps [3].

Among the less complex problems one can mention computing
the Hasse diagram of a poset (the transitive reduction of its graph), which
has cubic time complexity. We consider a language-theoretic approach to
posets that uses words over partially commutative alphabets. It allows us to
exploit the inner structure of a given poset to develop new algorithms. The
complexity of these algorithms depends not only on the number of elements of
a poset, but also on the complexity of its structure (the size of the concurrent
alphabet used to represent the poset). The basic theory together with some
algorithms can be found in [5], among others. However, most of the ideas
presented there are based on the projection representation of traces, which
results in O(nk) memory complexity.


In the first section we give some basic notions related to formal
language theory, partial orders and concurrency theory. In Section 2 we look more
closely at the relation between words over partially commutative alphabets
and posets. We analyse the dependence graphs of concurrent words and their
relation to the Hasse diagrams of posets. We also summarise the situation
when the Hasse diagram has a special structure. In particular, we show that every
poset can be generated by a word over a partially commutative alphabet.
Moreover, we prove that $P_4$-freeness of the dependence relation of the concurrent
alphabet guarantees N-freeness of the Hasse diagram.


In the following section we deal with decoding the Hasse diagram
from an arbitrary concurrent word and give an online algorithm for its
construction. The presented algorithm works in time $O(nk^2)$, where n
denotes the size of the poset and k the size of the alphabet. Note that the
presented algorithm has memory complexity $O(k^2)$. Together with the
possibility of immediate output of partial results, this allows us to process long
words.


The study of the properties of words over partially commutative
alphabets requires efficient tools for the enumeration of distinct classes of
equivalent words (in the sense of the independence relation). We deal with
this practical problem in the fourth section. Basically, we identify classes
of equivalent words with their lexicographical normal forms [5]. Further,
we show how to compute these representatives of all classes in
lexicographical order. For a given concurrent word that is canonical, a
single step of our algorithm computes the next (in lexicographical order)
word that is canonical. Moreover, if we consider possible blocks of identical
letters instead of their individual occurrences, we can achieve a better time
complexity of a single step.


The preliminary version of this paper was published in the Proceedings
of the Prague Stringology Conference 2011 ([13]). The new version is revised and
contains some additional facts, proofs, new and extended examples, and an
improved description of the generation algorithm (with better time complexity).


1 Basic Notions 


We use some basic notions of formal language theory. By $\Sigma$ we denote a finite
set, called the alphabet. Elements of the alphabet are called letters. Words
are sequences over the alphabet $\Sigma$. The set of all finite words is denoted by
$\Sigma^*$, while by $Alph(w) \subseteq \Sigma$ we denote the set of all letters contained in the
word w.

A concurrent alphabet is a pair $(\Sigma, D)$, where $\Sigma$ is an alphabet and
$D \subseteq \Sigma \times \Sigma$ is a reflexive and symmetric relation, called the dependence relation.
With dependence we associate another relation, the independence relation
$I = (\Sigma \times \Sigma) \setminus D$. Having the concurrent alphabet we define a relation that
identifies similar words. We say that a word $\sigma \in \Sigma^*$ is in relation $\equiv_D$ with a
word $\tau \in \Sigma^*$ if there exists a finite sequence of commutations of adjacent
independent letters that leads from $\sigma$ to $\tau$. The relation $\equiv_D \subseteq \Sigma^* \times \Sigma^*$ is
a congruence (whenever it causes no confusion, the subscript D will be
omitted).

To emphasise that a considered word $w \in \Sigma^*$ is over a concurrent alphabet
$(\Sigma, D)$ (an alphabet equipped with a dependence relation) we call it a partially
commutative word. On the other hand, dividing the set $\Sigma^*$ by the relation
$\equiv$ we get a quotient monoid. The elements of $\Sigma^*/\equiv$ are often called traces
(see [6], [11], [12]). This way, every partially commutative word $\sigma$ determines a
trace $\alpha = [\sigma]$.

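Example. Let $\Sigma = \{a,b,c,d\}$ and $(\Sigma, D)$ be the concurrent alphabet whose
dependence relation D and independence relation I are given as graphs.

[Figure: graphs of the relations D and I; not recoverable from the source.]

The words abbaacd and abbcaad are equivalent (they differ only in the
relative order of the independent letters a and c).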
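The equivalence claimed in the example can be checked mechanically. The sketch below is only an illustration: it assumes that D is the 4-cycle a-b, b-c, c-d, d-a (the relation suggested by the dependence lists used later in the paper, not something stated in the example itself), and it relies on the standard projection characterisation of traces, namely that two words are equivalent exactly when their projections onto every pair of dependent letters coincide. The function names are hypothetical.

```python
def make_dependence(letters, pairs):
    """Build a reflexive, symmetric dependence relation as a set of ordered pairs."""
    dep = {(x, x) for x in letters}
    for x, y in pairs:
        dep.add((x, y))
        dep.add((y, x))
    return dep


def projection(word, x, y):
    """Erase from word every letter other than x and y."""
    return [c for c in word if c == x or c == y]


def equivalent(u, v, dep):
    """Trace equivalence test via projections onto pairs of dependent letters."""
    letters = set(u) | set(v)
    return all(
        projection(u, x, y) == projection(v, x, y)
        for x in letters for y in letters if (x, y) in dep
    )


# Assumed dependence relation: a-b, b-c, c-d, d-a (so a, c and b, d are independent).
DEP = make_dependence("abcd", [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")])

print(equivalent("abbaacd", "abbcaad", DEP))  # True: c commutes with the two a's
print(equivalent("abbaacd", "abbacda", DEP))  # False: d and a are dependent
```

Projections onto a pair (x, x) compare the number of occurrences of x, so words with different letter counts are rejected as well.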

Note that the dependence relation D is reflexive. However, here and throughout
the paper, loops in the graphical representation of the relation are omitted.

A partial order on the set X is a reflexive, antisymmetric and transitive
relation $\leq \subseteq X \times X$. If additionally every pair of elements from X is
comparable, the relation $\leq_t$ is called a total order. A pair $(X, \leq)$ is called
a partially ordered set (poset for short). Observe that in the case of a
totally ordered set $(X, \leq_t)$ the elements of X form a sequence (denoted by $w_{\leq_t}$).
A linearisation of a partial order $(X, \leq)$ is a sequence $w_{\leq_t}$ for any total order
$(X, \leq_t)$ containing $(X, \leq)$, which means that $\leq \subseteq \leq_t$.


With every poset we can associate its directed graph (digraph for short)
$G = (X, E)$. The vertices of G are the elements of the poset. There is an arc
from a vertex x to a vertex y if $x < y$ (i.e. $x \leq y$ but $x \neq y$). Such a graph
is always acyclic. We can also define the Hasse diagram of the poset $(X, \leq)$
as the transitive reduction of the graph G. More generally, the graph of every
relation on X whose transitive closure is equal to $\leq$ is called a diagram of $\leq$.


Definition 1 Let $G = (X, E)$ be an acyclic graph. The Hasse diagram of G
is the acyclic graph $H = (X, E')$ with $E' \subseteq E$, such that an arc $(x,y) \in E$
belongs to $E'$ if there is no $z \in X$ (different from x and y) for which there are
both a path (in G) from x to z and a path from z to y.


An example of a poset's graph and its Hasse diagram is shown in
Figure 1. We can observe that the size of the Hasse diagram is significantly
smaller than the size of the poset's graph. Therefore, Hasse diagrams can be
seen as a compact representation of posets. Another efficient representation
of a poset is discussed in the following section.


Figure 1: The graph of an example poset. The dashed edges are not 
contained in its Hasse diagram. 
2 From Partially Commutative Words to Posets


With every word w over a partially commutative alphabet $(\Sigma, D)$ we can
associate a poset. One of the diagrams of this poset is induced by the
dependence graph of the word w. An element $v_j$ associated with the letter
$w_j$ is greater than an element $v_i$ associated with the letter $w_i$ if $i < j$ and
$w_i D w_j$. The label of the element (vertex) $v_i$ is denoted by $\ell(v_i) = w_i$. It is
worth noting that two words are equivalent if and only if their dependence
graphs are the same (isomorphic and respecting labelling).

By the definition of a diagram, the reflexive transitive closure of the
dependence graph of a word is a graph of the poset associated with the
word. Additionally, the transitive reduction of this dependence graph is exactly
the Hasse diagram of the considered poset, see Figure 2.
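The correspondence between dependence graphs and Hasse diagrams can be illustrated by a direct, non-optimised computation. The following sketch builds the dependence graph of a word and removes every arc that is implied by a longer path. It is a naive baseline, not the online algorithm of Section 3; the function name `hasse_naive`, the 0-based vertex numbering and the 4-cycle dependence relation are assumptions made for the illustration.

```python
def hasse_naive(word, dep):
    """Transitive reduction of the dependence graph of word (vertices 0..n-1)."""
    n = len(word)
    # Arcs of the dependence graph: i -> j whenever i < j and the letters are dependent.
    succ = {i: {j for j in range(i + 1, n) if (word[i], word[j]) in dep}
            for i in range(n)}
    # reach[i] = set of vertices reachable from i by a nonempty path.
    reach = {}
    for i in reversed(range(n)):
        reach[i] = set(succ[i])
        for j in succ[i]:
            reach[i] |= reach[j]
    # Keep the arc i -> j only if no successor z of i can reach j.
    return sorted((i, j) for i in range(n) for j in succ[i]
                  if not any(j in reach[z] for z in succ[i]))


# Assumed dependence relation: the 4-cycle a-b, b-c, c-d, d-a plus all loops.
pairs = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
dep = {(x, x) for x in "abcd"} | set(pairs) | {(y, x) for x, y in pairs}

print(hasse_naive("adbcb", dep))
# [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)]
```

For the word adbcb the dependence arcs 0 -> 4 and 2 -> 4 are dropped, because vertex 4 is already reachable through vertex 3.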


Remark 1 For an arbitrary concurrent word, its Hasse diagram representation
is unique. On the other hand, two different words over the same
concurrent alphabet can lead to the same Hasse diagram structure (without
taking into account the labelling of the nodes).


Figure 2: A concurrent alphabet $(\Sigma, D)$ with $\Sigma = \{a,b,c,d\}$, the dependence
graph and the Hasse diagram of the word abbacad over that alphabet.
Lemma 1 Every finite poset $(P, \leq)$ can be generated by a word over a
concurrent alphabet.


Proof: For a given finite poset $(P, \leq)$, let us define a concurrent alphabet
$(\Sigma, D)$ in such a way that $\Sigma = P$ and $p_1 D p_2$ if and only if $p_1 \leq p_2$ or $p_2 \leq p_1$.
An arbitrary linearisation of the poset $(P, \leq)$ corresponds in a natural way
to a word $v \in \Sigma^*$ which generates a poset equal to $(P, \leq)$.


The above observations allow us to represent every poset in a compressed
way by a pair consisting of a concurrent alphabet and a single word over that
alphabet. In the next section we will provide an efficient algorithm that
produces a Hasse diagram by decompressing a given word to its associated
poset.

Further optimisation, possible only for Hasse diagrams which are minimal
series-parallel graphs [18], leads us to another data structure which can be
used to solve many problems in a simpler way (for instance, the #P-complete
problem of counting the number of linear extensions [1] can be solved in
linear time for such posets). In what follows, by a sink of a directed graph
we mean any vertex $v_1$ that has no outgoing arc, while by a source we mean
any vertex $v_2$ that has no ingoing arc. Note that every acyclic graph has at
least one source and one sink, while a path may be considered as a special
kind of graph with all vertices having at most one ingoing and at most one
outgoing arc, and exactly one sink and one source.


Definition 2 A minimal Series-Parallel digraph (MSP) is a graph consisting
of a single vertex and no arcs, or a graph constructed from two disjoint MSPs
$G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$ by one of the following operations:


• Parallel composition: $G_p = (V_1 \cup V_2, E_1 \cup E_2)$;
• Serial composition: $G_s = (V_1 \cup V_2, E_1 \cup E_2 \cup T_1 \times S_2)$;


where $T_1$ is the set of sinks of $G_1$ and $S_2$ is the set of sources of $G_2$. In other
words, series-parallel graphs can be represented as expressions built by
series and parallel composition of graphs, with single-vertex graphs as atoms.
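The two composition operations translate almost literally into code. The sketch below represents a digraph as a pair (set of vertices, set of arcs); the function names are hypothetical and the operands are assumed to be vertex-disjoint, as the definition requires.

```python
def single(v):
    """The atomic MSP: one vertex, no arcs."""
    return ({v}, set())


def sinks(g):
    """Vertices with no outgoing arc."""
    vertices, arcs = g
    return {v for v in vertices if all(x != v for x, _ in arcs)}


def sources(g):
    """Vertices with no ingoing arc."""
    vertices, arcs = g
    return {v for v in vertices if all(y != v for _, y in arcs)}


def parallel(g1, g2):
    """Parallel composition: disjoint union of vertices and arcs."""
    return (g1[0] | g2[0], g1[1] | g2[1])


def series(g1, g2):
    """Serial composition: additionally join every sink of g1 to every source of g2."""
    bridge = {(t, s) for t in sinks(g1) for s in sources(g2)}
    return (g1[0] | g2[0], g1[1] | g2[1] | bridge)


# (1 || 2) followed in series by the chain 3 -> 4.
g = series(parallel(single(1), single(2)), series(single(3), single(4)))
print(sorted(g[1]))  # [(1, 3), (2, 3), (3, 4)]
```

Each MSP is thus described by an expression tree over `single`, `parallel` and `series`, which is the representation mentioned at the end of the definition.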


An example of the graphical representation of the composition operations
is shown in Figure 3.

The properties of series-parallel graphs are well studied (see for
instance [18]). A very useful characterisation of series-parallel graphs
is their N-freeness [17].
a - d
|   |      w = acbaddbaacabd
b - c

[Hasse diagram of w with its series-parallel blocks; not recoverable from the source.]


Figure 3: The dependence relation D, the word w and its Hasse diagram
divided into series-parallel blocks.


Definition 3 An N-poset is a poset consisting of four elements a, b, c, d
with relations a < c, b < c and b < d (drawing the graph of such a poset with
greater elements higher brings to mind the capital letter N) [10].


Definition 4 A poset is N-free if its graph does not contain an induced
subgraph isomorphic to the Hasse diagram of the N-poset.


Remark 2 In the case of undirected graphs, the analogue is a $P_4$-free graph (a
graph that does not contain an induced path of length 3).


An example of the graphical interpretation of the above-mentioned
notions can be seen in Figure 4.

In general, graphs of this type, also in the context of partial orders,
are well studied (see [9] and the references therein). However, one
observation worth mentioning is the following:


Lemma 2 If the dependence graph D of an alphabet $\Sigma$ is $P_4$-free then the
Hasse diagram of every partially commutative word w over $(\Sigma, D)$ is N-free.
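The hypothesis of the lemma is easy to test for a small alphabet. The brute-force sketch below looks for four distinct letters that induce a path with three edges; the function name `is_p4_free` and the two sample graphs are assumptions chosen for the illustration, and the running time (all ordered 4-tuples of vertices) is only acceptable for small alphabets.

```python
from itertools import permutations


def is_p4_free(vertices, edges):
    """Return True if the undirected graph has no induced path on four vertices."""
    adjacent = {frozenset(e) for e in edges}

    def adj(x, y):
        return frozenset((x, y)) in adjacent

    for p, q, r, s in permutations(vertices, 4):
        # p - q - r - s must be a path and no other pair may be adjacent.
        if (adj(p, q) and adj(q, r) and adj(r, s)
                and not adj(p, r) and not adj(p, s) and not adj(q, s)):
            return False
    return True


cycle = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]  # 4-cycle
path = [("a", "b"), ("b", "c"), ("c", "d")]               # P4 itself

print(is_p4_free("abcd", cycle))  # True
print(is_p4_free("abcd", path))   # False
```

The 4-cycle passes the test because the end points of every path through all four vertices are adjacent, so no such path is induced.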


Proof: We prove this lemma by contradiction. Let us suppose that there
exists a word w over a concurrent alphabet $(\Sigma, D)$ with a $P_4$-free graph of
the relation D, whose Hasse diagram $H(w)$ has an induced subgraph $N = (V_N, E_N)$ of
N shape.
Figure 4: N-poset, simple N-free poset and $P_4$-free graph.


Figure 5: Proof situation.


Without loss of generality we can assume that the graph N is the first graph
in Figure 4. It means that $V_N = \{v_1, v_2, v_3, v_4\}$ and the labels of these vertices
are $\ell(v_1) = a$, $\ell(v_2) = b$, $\ell(v_3) = c$, $\ell(v_4) = d$.

Let us consider the situation depicted in Figure 5. We start from
relation (i) and claim that the letters a and b are independent. Indeed, otherwise
there has to be a path p in $H(w)$ between the vertices $v_1$ and $v_2$. Let us suppose
that $v_1$ is the source of path p. Then there is a path from $v_1$ to $v_3$ through $v_2$, so there
should not be an arc $(v_1, v_3)$ in the Hasse diagram. The case when $v_2$ is the
source of p is symmetric and excludes the arc $(v_2, v_3)$.

We proceed by deducing that the letters c and d (relation (ii)) are also
independent. Otherwise, there is a path p between the vertices $v_3$ and $v_4$ in
the graph $H(w)$. If $v_3$ is the source of path p then the arc $(v_2, v_4)$ should not be present

Algorithmics of Posets Generated by Words Over 
Partially Commutative Alphabets (Extended Version) 237 


in $H(w)$. If $v_4$ is the source of path p then the arc $(v_2, v_3)$ should not be present
in $H(w)$.

The relations $a\,I\,b$ and $c\,I\,d$ show that $\ell(v_1) \neq \ell(v_2)$ and $\ell(v_3) \neq \ell(v_4)$.
We also know that $b\,D\,c$, so $\ell(v_1) \neq \ell(v_3)$, because one letter cannot be
both dependent on and independent of another. For similar reasons $b \neq d$
and $b \neq c$.

Now we consider the relation (iii) from Figure 5. Firstly, let us suppose
that $a\,I\,d$. Then also $a \neq d$ and we have a subalphabet $\{a, b, c, d\} \subseteq \Sigma$ with a
dependence graph of shape $P_4$. This contradicts the assumption
that D is $P_4$-free.

The last situation to consider is $a\,D\,d$. Then there should be a path p
between $v_1$ and $v_4$. If the vertex $v_4$ is the source of path p then we have a
path from $v_2$ to $v_3$ of length greater than 1, so the arc $(v_2, v_3)$ should not be
present in the graph $H(w)$. Let us suppose that the vertex $v_1$ is the source of path p.
Let the first arc of path p be $(v_1, v_5)$ and let the label of $v_5$ be denoted by e (see
Figure 5). Then the letter e is independent of c (otherwise, one of the arcs
$(v_1, v_3)$ or $(v_1, v_5)$ should not be present in $H(w)$) and independent of b
(otherwise, one of the arcs $(v_1, v_3)$ or $(v_2, v_4)$ should not be present in $H(w)$).
It means that $\ell(v_5) \neq \ell(v_2)$, $\ell(v_5) \neq \ell(v_3)$ and $\ell(v_5) \neq \ell(v_1)$, so we have a
subalphabet $\{a, b, c, e\} \subseteq \Sigma$ with a dependence graph of shape $P_4$. This
contradicts the assumption that D is $P_4$-free, and the proof is complete.


3 Construction of Hasse Diagram 


This section is devoted to the problem of constructing the Hasse diagram
(see Definition 1) for an arbitrary concurrent word. We first
give an algorithm and its pseudo-code. After that, we discuss the complexity
of our solution.

The algorithm exploits the knowledge of the structure of resulting 
diagram. We can summarise it in the following facts: 


Lemma 3 Let $w \in \Sigma^*$ be a word and $H(w) = (V, E_H)$ be the Hasse diagram
of w. If there exists an arc connecting vertices $v_i$ and $v_j$ (labelled $a = w_i$ and
$b = w_j$ respectively) then the letters a and b do not appear in the word w between
indexes i and j.


Proof: Let $G = (V, E)$ be the dependence graph of the word w over the
concurrent alphabet $(\Sigma, D)$. The existence of an arc between $v_i$ and $v_j$ in
the graph H implies that there is also an arc in the graph G, hence the letters a and
b are dependent (formally $a\,D\,b$). Let us suppose that there exists a letter
$c = \ell(v_k)$ (for $i < k < j$) that is dependent on both a and b (in particular, c may
be a or b itself). Then, by
Definition 1, there is a path in the graph G between the vertices $v_i$ and $v_j$ of length
greater than one, so there is no arc between $v_i$ and $v_j$ in the graph $H(w)$, which
leads to a contradiction and completes the proof.


Lemma 4 Let $w \in \Sigma^*$ be a word and $H(w)$ be the Hasse diagram of w. Then,
in $H(w)$, for each vertex there are no more than $k = |\Sigma|$ outgoing arcs and
no more than k ingoing arcs.


Proof: Let $G = (V, E)$ be the dependence graph of the word w over the concurrent
alphabet $(\Sigma, D)$. Let us suppose that there is a vertex $v_i$ which has k+1
outgoing arcs. There are k letters in the alphabet $\Sigma$, so two of these outgoing
arcs lead to two distinct vertices $v_j$ and $v_m$ ($i < j < m$) labelled with the
same letter. Without loss of generality we can assume that $\ell(v_i) = a$ and
$\ell(v_j) = \ell(v_m) = b$. By Lemma 3 there is no arc in the graph $H(w)$ between
the vertices $v_i$ and $v_m$, which proves that there are at most k outgoing arcs.
Similar reasoning allows us to prove the second part of the lemma on the
number of ingoing arcs.


Lemma 5 Let $w \in \Sigma^*$ be a word and $H(w) = (V, E)$ be the Hasse diagram
of w. The ingoing arcs of a given vertex $v_i$ are fully determined by the vertices
associated with the last occurrences of letters dependent with $\ell(v_i)$. More formally,
$(v_j, v_i) \in E$ if and only if $j < i$ and $\ell(v_j)\,D\,\ell(v_i)$ and there is no vertex $v_k$
such that $\ell(v_k)\,D\,\ell(v_i)$ and $j < k < i$ and there is a path from $v_j$ to $v_k$.


Proof: Let $(v_j, v_i)$, where $\ell(v_j) = a$, be an arc in $H(w)$. Lemma 3 implies
that $v_j$ must be the last occurrence of the letter a in the word w that precedes $w_i$.
The second part of the lemma follows directly from Definition 1 (see the proof of
Lemma 3).


Using the foregoing observations we propose an additional structure that
saves information about the last occurrences of each letter processed so far. It
allows us to immediately add a new vertex to the Hasse diagram, with all of
its ingoing arcs. Our structure consists of a list of dependencies, a set of
visibility (both of size at most k) and a pointer to the last occurrence, for
each letter of the alphabet $\Sigma$. The list $D_a$ contains all letters dependent with
a in LIFO (last in, first out) order of their last occurrence in the currently
constructed part of the diagram. The set $V_a$ contains all letters b whose last
occurrences are visible from the last vertex labelled with a. In other words,
there exists a path from $v_i$ to $v_j$, where $v_i$ and $v_j$ are the last occurrences of
the letters $\ell(v_i) = b$ and $\ell(v_j) = a$ in the diagram built so far. Such elements $v_i$ will be called
sources of $v_j$. The last element is a pointer $L_a$ to the last vertex labelled
with the letter a in the processed diagram. We will
also use a temporary set V.


Before we start generating the Hasse diagram, we set all pointers to null
and all sets to be empty. The lists of dependencies should be complete with
all dependent letters, but the initial order does not matter. With such data
we are ready to process a new letter a of a word w in an online manner, updating
the proposed structure after each step and creating a new vertex and new
arcs. During the addition of the new vertex labelled with the letter a we clear
the set V and browse the list $D_a$. For each letter b from that list we check if the
pointer $L_b$ is not empty and if b does not belong to V (its last occurrence is
not already visible from the new vertex). If both conditions hold, we add a new arc
from the vertex $v_b$ pointed to by $L_b$ to the newly created vertex. The addition of
a new arc implies that there is also a path from every source of $v_b$ to the
recently created vertex. Therefore, we add the set $V_b$ to our temporary set V. It
is worth noting that the order of processing the letters from the list $D_a$ is important
because of the dynamically changing set V.


After adding the new arcs, we have to update our structure. Firstly, we
remove the letter a from each set $V_b$, since the new vertex is now the last occurrence
of the letter a. Next, we move the letter a to the front of every list $D_b$ that contains it,
since a is now the most recent letter. The last operation is the update of the
set $V_a$ to $V \cup \{a\}$ and of the pointer $L_a$ to the position of the new vertex. Note that
in the rest of the generation process we need only the most recent vertex
labelled with a and we do not have to store other vertices labelled with the
same letter.


The correctness of the algorithm presented above relies on the lemmas
formulated at the beginning of this section. Let us discuss the memory and
time complexity of our solution. The proposed data structure consists of k
lists D of at most k items each. It gives us $k^2$ elements. The k sets V can
Algorithm 1: Hasse diagram

 1 Input: a word w = w_1 w_2 ... w_n over a concurrent alphabet (Σ, D)
 2 Output: a graph G representing the Hasse diagram

 3 foreach a ∈ Σ do
 4     L_a := ∅; V_a := ∅;

 5 for i := 1 to n do
 6     a := w_i; V := ∅;
 7     foreach b ∈ D_a, in order of the last occurrence do
 8         if L_b ≠ ∅ and b ∉ V then
 9             Insert an arc w_{L_b} → w_i into G;
10             V := V ∪ V_b;
11     foreach b ∈ Σ do
12         V_b := V_b \ {a};
13     foreach b ∈ D_a do
14         Move a to the beginning of D_b;
15     V_a := V ∪ {a}; L_a := i;


be implemented using $O(k^2)$ memory; we also need k pointers L. Summing
up, the most significant part of this data structure is the set of lists and the
memory complexity is $O(k^2)$.


The presented algorithm is online, which gives a linear factor in the time
complexity. Let us analyse a single step of extending the diagram with a new
vertex (processing a new letter). It consists of a sequence of three loops.
The first one is the most significant. We have to compute at most k unions of
subsets of the set $\Sigma$. It gives us a factor $k^2$. Each of the k operations in the second
loop (line 11) can be done in constant time. Furthermore, the operations in the
last loop (line 13) have logarithmic time complexity if we make use of a priority
queue, but they can also be implemented in constant time. Summarising, we have
a complexity of $O(k^2)$ for each step of the algorithm, which in total gives $O(nk^2)$
time complexity for processing the whole word. See Figure 6 for a detailed
step by step example of the Hasse diagram generation.
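The online construction can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than a tuned implementation: the names `hasse_online`, `order`, `visible` and `last` are hypothetical stand-ins for the lists D, the sets V and the pointers L, plain Python lists are used for the dependence lists (so moving a letter to the front costs O(k) instead of constant time), and the dependence relation in the usage example is assumed to be the 4-cycle a-b, b-c, c-d, d-a.

```python
def hasse_online(word, sigma, dep):
    """Yield the arcs (j, i) of the Hasse diagram, 1-based, as each letter arrives."""
    order = {a: [b for b in sigma if (a, b) in dep] for a in sigma}  # lists D_a
    visible = {a: set() for a in sigma}                              # sets V_a
    last = {a: None for a in sigma}                                  # pointers L_a
    for i, a in enumerate(word, start=1):
        seen = set()                        # temporary set V
        for b in order[a]:                  # most recently seen letters first
            if last[b] is not None and b not in seen:
                yield (last[b], i)          # arc from the last b to the new vertex
                seen |= visible[b]          # everything visible from b is now visible
        for b in sigma:                     # the old occurrence of a is no longer last
            visible[b].discard(a)
        for b in sigma:                     # a becomes the most recent letter in D_b
            if (a, b) in dep:
                order[b].remove(a)
                order[b].insert(0, a)
        visible[a] = seen | {a}
        last[a] = i


# Assumed dependence relation: the 4-cycle a-b, b-c, c-d, d-a plus all loops.
pairs = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
dep = {(x, x) for x in "abcd"} | set(pairs) | {(y, x) for x, y in pairs}

print(list(hasse_online("adbcb", "abcd", dep)))
# [(1, 2), (1, 3), (3, 4), (2, 4), (4, 5)]
```

Because the function is a generator, each arc is reported as soon as the corresponding letter has been read, and only the per-letter structures are kept in memory, which mirrors the immediate output of partial results discussed above.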
[Figure 6 tabulates, for the empty prefix and for each of the prefixes a, ad, adb,
adbc and adbcb, the dependence lists $D_x$, the visibility sets $V_x$, the pointers $L_x$
(for $x \in \{a, b, c, d\}$) and the Hasse diagram built so far; the cell contents are
not recoverable from the source.]


Figure 6: Example of the Hasse diagram generation for the word adbcb using
Algorithm 1.


242 L. Mikulski, M. Piatkowski, S. Smyczynski 


4 Generation of All Disjoint Traces 


The problem with the compressed presentation of a poset discussed in the
previous sections is that it is not unique, see Remark 1. For a given ordered
concurrent alphabet $(\Sigma = \{a_1 < a_2 < \dots < a_k\}, D)$ and a word w, every
other word v equivalent to w represents the same poset. To overcome this
disadvantage we can use the notion of lexicographic normal form [5]. Basically,
from all the representatives we choose the lexicographically minimal one as the
normal form. All words that are in such a normal form are called canonical
words. The natural problem that arises is to enumerate all nonequivalent
words (in fact, lexicographic normal forms of traces) of length n for a given
concurrent alphabet. In this section we deal with this problem.

Let $\Sigma = \{a_1 < a_2 < \dots < a_k\}$ be an ordered alphabet and X a
set of words over $\Sigma$. For a word $w \in X$ we define its X-successor as the
lexicographically minimal word $v \in X$ such that $v \neq w$ and $w < v$.

The proposed algorithm is motivated by the well-known SEPA algorithm,
see [7], [8]. We consider a set X of lexicographically minimal representatives
of all nonequivalent traces of length n. For a given word w we identify and
modify only its working suffix, that is, the suffix of w which makes it different
from its X-successor. We begin the enumeration with the lexicographically minimal
word $w = a_1 a_1 \dots a_1$. Then, we repeatedly replace the current word with
its X-successor. The correctness of the proposed procedure follows from several
corollaries to the following fact:


Proposition 1 Let $(\Sigma, D)$ be a concurrent alphabet and < be a linear
ordering of $\Sigma$. Then, a word $w \in \Sigma^*$ is the lexicographic normal form of a
trace over $(\Sigma, D)$ (is canonical) if and only if for each factor aub of w with
$a, b \in \Sigma$, $u \in \Sigma^*$ and $\forall_{c \in Alph(au)}\ (c, b) \in I$, it holds that $a < b$.
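Proposition 1 gives a direct test for canonicity: for every position, scan backwards over the maximal run of letters independent of the letter at that position and verify that each of them is smaller. The sketch below implements this test; the function name `is_canonical`, the representation of the ordered alphabet as a string and the 4-cycle dependence relation are assumptions made for the illustration.

```python
def is_canonical(word, sigma, dep):
    """Check the condition of Proposition 1; sigma lists the letters in increasing order."""
    rank = {a: r for r, a in enumerate(sigma)}
    for j, b in enumerate(word):
        i = j - 1
        # Walk left while the letters stay independent of b.
        while i >= 0 and (word[i], b) not in dep:
            if rank[word[i]] > rank[b]:
                return False       # a factor a u b with a > b and au independent of b
            i -= 1
    return True


# Assumed dependence relation: the 4-cycle a-b, b-c, c-d, d-a plus all loops.
pairs = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
dep = {(x, x) for x in "abcd"} | set(pairs) | {(y, x) for x, y in pairs}

print(is_canonical("abbaacd", "abcd", dep))  # True
print(is_canonical("abbcaad", "abcd", dep))  # False: c precedes the independent a
```

Under the assumed relation the two words belong to the same trace, and only the first one is its lexicographic normal form.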


The proof of Proposition 1 can be found in [4]. To keep the paper self-contained we
include the following corollaries together with independent proofs.


Corollary 1 If wv is a canonical word then both words w and v are also
canonical. In other words, prefixes and suffixes of canonical words are
canonical.


Proof: Observe that each factor of w is also a factor of wv, hence, by
Proposition 1, for each factor aub of w with $a, b \in \Sigma$, $u \in \Sigma^*$ and
$\forall_{c \in Alph(au)}\ (c, b) \in I$, it holds that $a < b$. Using Proposition 1 again we
obtain that w is canonical.

Note that the same arguments can be applied to any factor of wv,
in particular to v.
Corollary 2 In every canonical word w, if there exists i such that the letters $w_i$
and $w_{i+1}$ are independent then $w_i < w_{i+1}$.


Proof: We apply Proposition 1 for $u = \varepsilon$.


Corollary 3 If there exists a substring $w_i w_{i+1} \dots w_{j-1} w_j$ of a canonical word
w such that the letter $w_j$ is independent of all letters $w_i, w_{i+1}, \dots, w_{j-1}$ then
$w_j$ is the maximal among these letters. More precisely,


$\forall_{l \in \{i, i+1, \dots, j-1\}}\ w_j > w_l$

Proof: We apply Proposition 1 to $w_l w_{l+1} \dots w_{j-1} w_j$ for $i \leq l < j$.


Definition 5 Let $a \in \Sigma$ be a letter. By $C_a^n$ we denote the set of all canonical
words of length n which start with the letter a.


It is an easy observation that the set $C_a^n$ is nonempty. It contains at
least the word $a^n$. Moreover, $C_a^1 = \{a\}$.


Lemma 6 Let $w_1 \in \Sigma$ be an arbitrary but fixed letter and $w = w_1 w_2 \dots w_n$
be the lexicographically smallest word from $C_{w_1}^n$ (for $n > 1$). Then the letter
$w_2$ is the smallest letter dependent with the letter $w_1$ and the word $w_2 \dots w_n$
is the lexicographically smallest word from $C_{w_2}^{n-1}$. Moreover, the sequence of
letters $w_1, w_2, \dots, w_n$ is nonincreasing and every two consecutive letters from
this sequence are dependent.


Proof: We give the proof by induction on the length n.

Let $w \in C_{w_1}^2$. Then w is of the form $w_1 w_2$, where $w_2$ is dependent with
$w_1$ or strictly greater than $w_1$. Therefore, the smallest element of $C_{w_1}^2$ is the
word $w_1 w_2$, where $w_2$ is the smallest letter dependent with $w_1$ (maybe $w_1$
itself). The other parts of the lemma are clearly satisfied.

Let us suppose that the lemma holds for all letters and lengths smaller
than k. We prove the case of the letter $w_1$ and length k. Let us suppose that the
word $w = w_1 w_2 \dots w_k$ is the lexicographically smallest word from $C_{w_1}^k$. Then
the letter $w_2$ is (similarly to the case of length 2) dependent with $w_1$ and not
greater than $w_1$. Moreover, by Corollary 1 the word $w_2 \dots w_k$ is canonical.
If it were not the smallest word from the set $C_{w_2}^{k-1}$, we could replace it with
the smallest word of that set, obtaining a better candidate for the minimum, and the
proof is complete.




The foregoing facts provide us enough information on the structure of 
the canonical words to design the algorithm transforming a given canonical 
word w into its X-successor. The algorithm consists of three steps: 


1. Finding the last index $i$ such that $w_i \neq a_k$. We know that index $i$ is
the starting position of the working suffix.


2. Computing the minimal letter a greater than $w_i$ such that $w_1 w_2 \dots w_{i-1} a$
is canonical. It is implied by Corollary 1.


3. Generating the rest of the working suffix to obtain the minimal canonical
word that starts from the letter a (at position $i$).


To implement the second step we introduce an oracle $V: \{1, \dots, n\} \times \Sigma \to \Sigma$.
For every position $i$ and every letter a, the value $V(i, a) = V_i(a)$ answers
the question: is there a substring $w_j w_{j+1} \dots w_{i-1}$ such that all letters
$w_j, w_{j+1}, \dots, w_{i-1}$ are independent of a and at least one letter from this
substring is greater than a? In the case of a positive answer, $V_i(a)$ gives
the maximal witness (the maximal letter over all substrings with the considered
property); otherwise it simply returns a. Such an oracle can be constructed
in linear time (with respect to n) using the following formula:


$$V_1(a) = a$$

$$V_i(a) = \begin{cases} a & \text{if } a\,D\,w_{i-1} \\ \max(w_{i-1}, V_{i-1}(a)) & \text{otherwise} \end{cases}$$


For every letter a such that $V_i(a) = a$, the string $w_1 w_2 \dots w_{i-1} a$ is canonical.
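The recurrence for the oracle can be evaluated column by column. The following sketch stores column $V_i$ as a dictionary at list index i-1; the function name `oracle`, the three-letter alphabet and its dependence relation (only a and b dependent, apart from the loops) are hypothetical choices for the illustration.

```python
def oracle(word, sigma, dep):
    """Return [V_1, ..., V_n]; each V_i maps a letter a to the value V_i(a)."""
    rank = {a: r for r, a in enumerate(sigma)}
    columns = [{a: a for a in sigma}]             # V_1(a) = a
    for idx in range(1, len(word)):
        prev = word[idx - 1]                      # the letter w_{i-1} for i = idx + 1
        columns.append({
            a: a if (a, prev) in dep
            else max(prev, columns[-1][a], key=rank.get)
            for a in sigma
        })
    return columns


# Assumed alphabet a < b < c in which only a and b are dependent (plus loops).
dep = {(x, x) for x in "abc"} | {("a", "b"), ("b", "a")}

V = oracle("bcc", "abc", dep)
print(V[2])  # {'a': 'c', 'b': 'c', 'c': 'c'}
```

Here $V_3(a) = c \neq a$, so the word bca is not canonical, whereas $V_3(c) = c$ confirms that bcc is.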

For the efficient generation of the working suffix in step three we use a
precomputed table $D_{min}$ such that $\forall_{a \in \Sigma}\ D_{min}(a) = \min\{b \in \Sigma : a\,D\,b\}$.

After generating a new canonical word, we have to update the oracle V.
The value of $V_i(a)$ depends only on $V_{i-1}(a)$ and the letter $w_{i-1}$. Therefore, we
only have to update the oracle from $V_{i+1}$ to $V_n$ (for the whole working suffix).
Moreover, if there exists an index $l$ in the working suffix such that $w_l = w_{l+1}$,
then the rest of the suffix is constant (all following letters are equal to $w_l$)
and the computation of the missing oracle values is trivial ($V_{l+2} = V_{l+3} = \dots =
V_n = V_{l+1}$). An example of the enumeration process described above is
shown in Figure 7.

The canonical word abeceeee is transformed into the next canonical word
abeecbbb in the lexicographic order. The first letter of the working suffix, c, is
changed to e; it cannot be increased to d because the oracle gives $V_4(d) \neq d$. Then
the rest of the suffix is generated using the $D_{min}$ table: $D_{min}(e) = c$, $D_{min}(c) = b$
[Figure 7 shows the alphabet $\Sigma = \{a, b, c, d, e\}$ with its dependence graph D,
the table $D_{min}$ ($D_{min}(a) = a$, $D_{min}(b) = b$, $D_{min}(c) = b$, $D_{min}(d) = a$,
$D_{min}(e) = c$), the working suffixes of abeceeee and abeecbbb, and the oracle
columns $V_1, \dots, V_8$ before and after the update; the graph and the oracle values
are not recoverable from the source.]


Figure 7: The example of X-successor computation.


and $D_{min}(b) = b$. Finally, the working suffix ceeee is transformed into ecbbb.
The oracle columns $V_5$, $V_6$, $V_7$, $V_8$ are updated afterwards.

The observations mentioned above lead us to Algorithms 2 and 3. Let
us discuss their memory and time complexity. The memory used is
O(nk), mostly for the oracle V. The time complexity of the steps needed for
generating the next canonical word depends on the length #SUFF of the
working suffix (lines 6 to 13 of Algorithm 2). Line 6 is linear with
respect to #SUFF. The loop in lines 7-9 performs at most k iterations. The
next loop (lines 10-11), which generates the suffix, makes exactly #SUFF
operations. The most complex work is done in the last loop, which updates
the oracle. At most k times the execution of the procedure Update Oracle is
nontrivial and computes the whole column $V_i$. The rest of the computation (at most
#SUFF times) will end up at line 4 of the Update Oracle procedure, which
can be simply implemented as reference copying. It gives $O(k^2 + \#SUFF)$
complexity of the last loop.

Note that instead of reference copying we can make use of blocks of
the same letters (as in the Run Length Encoding compressed representation,
see [14]). Such blocks may also appear when the algorithm changes the first
element of the working suffix to the last letter of the preceding prefix. Such
a solution needs a more careful implementation but enables the reduction of
the time complexity of a single step from $O(k^2 + \#SUFF)$ to $O(k^2)$. Note
that it makes the time complexity of a single step independent of the length

Algorithm 2: Enumerate Canonical Words 
 1  Input: w := a_1 a_1 ... a_1; 
 2  Output: succ(w); 
 3  for i := 1 to n do 
 4      Update Oracle V_i; 
 5  repeat 
 6      i := last index such that w_i ≠ a_k; 
 7      repeat 
 8          w_i := succ(w_i); 
 9      until V_i(w_i) = w_i; 
10      for j := i + 1 to n do 
11          w_j := D_min(w_{j-1});   // Generate suffix 
12      for j := i + 1 to n do 
13          Update Oracle V_j; 
14      OUTPUT w; 
15  until w = a_k a_k ... a_k; 


of the word. However, when implementing this solution we must not forget that 
each non-singleton block needs two columns of the oracle. 

Let us recall the example presented in Figure 7. Observe that the 
compressed version of the working suffix is ce⁴, while the resulting word is 
abe²cb³. In this way we avoid both reference copying and filling the 
suffix with a constant repeating value (the letter b in the example). 
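
The block representation used here can be illustrated with a short Python sketch. It only shows the run-length view of the words from the example and does not implement the compressed successor step; the helper names `rle` and `show` are hypothetical.

```python
from itertools import groupby


def rle(word):
    """Split a word into maximal blocks of equal letters: (letter, count)."""
    return [(letter, len(list(run))) for letter, run in groupby(word)]


def show(blocks):
    """Print blocks in exponent notation; singleton blocks stay plain letters."""
    return "".join(l if c == 1 else f"{l}^{c}" for l, c in blocks)


print(show(rle("ceeee")))     # ce^4
print(show(rle("abeecbbb")))  # abe^2cb^3
```

In this representation the constant tail bbb of the new suffix is a single block, so its cost does not grow with the number of repeated letters.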

If we treat k as a constant and let only n grow, the time complexity of 
a single step of X-successor generation is O(#SUFF), or O(1) in the 
compressed version, and is therefore optimal. Nevertheless, it would be very 
interesting to investigate the case when k is close to n. This case needs 
another kind of optimisation and new algorithms. 


5 Summary and Future Work 


In this paper we have discussed an approach to encoding posets by strings. We 
have used concurrent alphabets and the well-known notion of a Hasse diagram, 
which may be significantly smaller than the graph of a poset. We have 

Algorithm 3: Update Oracle V_i 
 1  if i = 1 then 
 2      foreach a ∈ Σ do V_i(a) := a; 
 3  else if i > 2 and w_{i-2} = w_{i-1} then 
 4      V_i := V_{i-1}; 
 5  else 
 6      foreach a ∈ Σ do 
 7          if a D w_{i-1} then 
 8              V_i(a) := a; 
 9          else 
10              V_i(a) := max(w_{i-1}, V_{i-1}(a)); 


shown that every poset can be represented by a pair consisting of a concurrent 
alphabet and a word over this alphabet. An interesting open question is how to 
choose the best such pair. The first criterion is the size of the concurrent alphabet 
(the one used in the proof of Lemma 1 is chosen very inefficiently). The 
second important property is the preservation of N-freeness, achieved by a 
P_4-free dependence relation graph. 


In the third section we gave an efficient online algorithm that decompresses 
a concurrent word into a Hasse diagram. It is worth noting that the 
concurrent word given as input to our algorithm does not have to be 
in a normal form and may be very long, as we need to store neither the 
entire word nor the entire diagram (only a small piece of size O(k²)). Moreover, 
by utilising additional data in Algorithm 1 we are able to implement an efficient 
algorithm for the concatenation of Hasse diagrams (over the same concurrent 
alphabet). The study of similar constructions for the star operation would be 
very interesting and should lead to an efficient algebra of posets. Such a tool 
would be very useful for modelling systems based on partial orders. 


Section four is devoted to an algorithm which enumerates all nonequivalent 
strings (in the sense of the dependence relation). The main idea is to 
construct an algorithm that is optimal (for a constant size k of the alphabet) 
with respect to the changes performed. We also present the idea of using a 
well-known compressed string representation (RLE), which yields 
constant time complexity of a single step. The case of k close to n 
needs further work and new algorithms. Other possible directions for further 
research are making use of run length encoding and considering only traces 
restricted to a fixed Parikh vector. 